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Abstract. The theory of (tight) wavelet frames has been extensively studied in the past twenty years and 
they arc currently widely used for image restoration and other image processing and analysis problems. The 
success of wavelet frame based models, including balanced approach |18l [7] and analysis based approach 
1111 1311 1501 . is due to their capability of sparsely approximating piecewise smooth functions like images. 
Motivated by the balanced approach and analysis based approach, we shall propose a wavelet frame based 
Iq minimization model, where the Iq "norm" of the frame coefficients is penalized. We adapt the penalty 
decomposition (PD) method of [40] to solve the proposed optimization problem. Some convergence analysis 
of the adapted PD method will also be provided. Numerical results showed that the proposed model solved 
by the PD method can generate images with better quality than those obtained by either analysis based 
approach or balanced approach in terms of restoring sharp features as well as maintaining smoothness of 
the recovered images. 

1. Introduction 

Mathematics has been playing an important role in the modern developments of image processing and 
analysis. Image restoration, including image denoising, dcblurring, inpainting, tomography, etc., is one of 
the most important areas in image processing and analysis. Its major purpose is to enhance the quality of a 
given image that is corrupted in various ways during the process of imaging, acquisition and communication; 
and enable us to see crucial but subtle objects residing in the image. Therefore, image restoration is an 
important step to take towards accurate interpretations of the physical world and making optimal decisions. 

1.1. Image Restoration. Image restoration is often formulated as a linear inverse problem. For the sim- 
plicity of the notations, we denote the images as vectors in K™ with n equals to the total number of pixels. 
A typical image restoration problem is formulated as 

(1.1) f = Au + V , 

where / € ~BL d is the observed image (or measurements), r\ denotes white Gaussian noise with variance a 2 , 
and A £ M. dxn is some linear operator. The objective is to find the unknown true image u € W 1 from the 
observed image /. Typically, the linear operator in is a convolution operator for image dcconvolution 
problems, a projection operator for image inpainting and partial Radon transform for computed tomography. 
To solve u from one of the most natural choices is the following least square problem 



min 1 1 Au — f I 
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where || ■ H2 denotes the ^ 2 - norm - This is, however, not a good idea in general. Taking image dcconvolution 
problem as an example, since the matrix A is ill-conditioned, the noise r\ possessed by / will be amplified after 
solving the above least squares problem. Therefore, in order to suppress the effect of noise and also preserve 
key features of the image, e.g., edges, various regularization based optimization models were proposed in the 
literature. Among all regularization based models for image restoration, variational methods and wavelet 
frames based approaches are widely adopted and have been proven successful. 

The trend of variational methods and partial differential equation (PDE) based image processing started 
with the refined Rudin-Osher-Fatemi (ROF) model [55] which penalizes the total variation (TV) of u. Many 
of the current PDE based methods for image denoising and decomposition utilize TV regularization for its 
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beneficial edge preserving property (see e.g., [HI [23 E2] ) • ^ ne ROF model is especially effective on restoring 
images that are piecewise constant, e.g., binary images. Other types of variational models were also proposed 
after the ROF model. We refer the interested readers to [36l [fll HH g2l [22j [2 [23l [53] and the references 
therein for more details. 

Wavelet frame based approaches are relatively new and came from a different path. The basic idea 
for wavelet frame based approaches is that images can be sparsely approximated by properly designed 
wavelet frames, and hence, the regularization used for wavelet frame based models is the ^i-norm of frame 
coefficients. Although wavelet frame based approaches take similar forms as variational methods, they 
were generally considered as different approaches than variational methods because, among many other 
reasons, wavelet frame based approaches is defined for discrete data, while variational methods assume all 
variables are functions. Some studies in the literature (see for example |51j ) indicated that there was a 
relation between Haar wavelet and total variation. However, it was not clear if there exists a general relation 
between wavelet frames and variational models (with general differential operators) in the context of image 
restorations. In a recent paper [9], the authors established a rigorous connection between one of the wavelet 
frame based approaches, namely the analysis based approach, and variational models. It was shown in [9] 
that the analysis based approach can be regarded as a finite difference approximation of a certain type of 
general variational model, and such approximation will be exact when image resolution goes to infinity. 
Furthermore, through Gamma-convergence, the authors showed that the solutions of the analysis based 
approach also approximate the solutions of the corresponding variational model. Such connections not only 
grant geometric interpretation to wavelet frame based approaches, but also lead to even wider applications 
of them, e.g., image segmentation |27| and 3D surface reconstruction from unorganized point sets |29| . On 
the other hand, the discretization provided by wavelet frames was shown, in e.g., [HI [20] [TO] HU El [28] , to 
be superior than the standard discretizations for some of the variational models, due to the multiresolution 
structure and redundancy of wavelet frames which enable wavelet frame based models to adaptively choose 
a proper differential operators in different regions of a given image according to the order of the singularity 
of the underlying solutions. For these reasons, as well as the fact that digital images are always discrete, we 
use wavelet frames as the tool for image restoration in this paper. 

1.2. Wavelet Frame Based Approaches. We now briefly introduce the concept of tight frames and 
tight wavelet frame, and then recall some of the frame based image restoration models. Interesting readers 
should consult [44] [24] [25] for theories of frames and wavelet frames, [47] for a short survey on theory and 
applications of frames, and |28j for a more detailed survey. 

A countable set X C is called a tight frame of Li($) if 

f=J2(f,h)h V/6L,(R), 

where (•, •) is the inner product of L^M). The tight frame X is called a tight wavelet frame if the elements of 
X are generated by dilations and translations of finitely many functions called framelets. The construction 
of framelets can be obtained by the unitary extension principle (UEP) of |44j . In our implementations, we 
will mainly use the piecewise linear B-splinc framelets constructed by |44j . Given a 1-dimensional framelet 
system for the s-dimensional tight wavelet frame system for L2(K S ) can be easily constructed by 

using tensor products of 1-dimensional framelets (see e.g., [24l I28j ). 

In the discrete setting, we will use W £ M mxn with m > n to denote fast tensor product framelet 
decomposition and use W T to denote the fast reconstruction. Then by the unitary extension principle 
[44], we have W T W = /, i.e., u = W T Wu for any image u. We will further denote an L-level framelet 
decomposition of u as 

Wu = (. . . , Wiju, ...f for < I < L - l,j £ I, 

where I denotes the index set of all framelet bands and WijU £ R n . Under such notation, we have 
m = L X \I\ X n. We will also use a £ R m to denote the frame coefficients, i.e., a = Wu, where 

a = (. . . , aij, . . .) T , with otij = Wiju. 
More details on discrete algorithms of framelet transforms can be found in [28] . 
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Since tight wavelet frame systems are redundant systems (i.e., m > n), the representation of u in the frame 
domain is not unique. Therefore, there are mainly three formulations utilizing the sparseness of the frame 
coefficients, namely, analysis based approach, synthesis based approach, and balanced approach. Detailed 
and integrated descriptions of these three methods can be found in [28] . 

The wavelet frame based image processing started from |18[I19] for high-resolution image reconstructions, 
where the proposed algorithm was later analyzed in [7] . These work lead to the following balanced approach 

m 

(1.2) mm h\AW T a-f\\ 2 D + ^\\(I-WW T )a\\l + 

a£K m Z Z 

where p = 1 or 2, < ft < oo, Xij > is a scalar parameter, and || • \\d denotes the weighted ^2-norm with D 
positive definite. This formulation is referred to as the balanced approach because it balances the sparsity of 
the frame coefficient and the smoothness of the image. The balanced approach (|1.2[) was applied to various 
applications in [16l |2T] |48 ] [39] . 

When k — 0, only the sparsity of the frame coefficient is penalized. This is called the synthesis based 
approach, as the image is synthesized by the sparsest coefficient vector(see e.g., (26] 32, 33, 34] [35]). When 
K = +oo, only the sparsity of canonical wavelet frame coefficients, which corresponds to the smoothness of 
the underlying image, is penalized. For this case, problem (|1.2[) can be rewritten as 

(1.3) min h\Au-f\\ 2 D + 

jtfl" z 

This is called the analysis based approach, as the coefficient is in range of the analysis operator (see, for 
example, Q3] EJ ED] ) . 

Note that if we take p = 1 for the last term of (|1.2[) and (|1.3p . it is known as the anisotropic £i-norm 
of the frame coefficients, which is the case used for earlier frame based image restoration models. The case 
p = 2, called isotropic £i-norm of the frame coefficients, was proposed in [9] and was shown to be superior 
than anisotropic ^-norm. Therefore, we will choose p = 2 for our simulations. 

1.3. Motivations and Contributions. For most of the variational models and wavelet frame based ap- 
proaches, the choice of norm for the regularization term is the £i-norm. Taking wavelet frame based ap- 
proaches for example, the attempt of minimizing the ^i-norm of the frame coefficients is to increase their 
sparsity, which is the right thing to do since piecewise smooth functions like images can be sparsely ap- 
proximated by tight wavelet frames. Although the £i-norm of a vector does not directly correspond to its 
cardinality in contrast to £ "norm" , it can be regarded as a convex approximation to £q "norm" . Such 
approximation is also an excellent approximation for many cases. It was shown by |12] . which generalizes 
the exciting results of compressed sensing [T3] US] dH GH] , that for a given wavelet frame, if the operator A 
satisfies certain conditions, and if the unknown true image can be sparsely approximated by the given wavelet 
frame, one can robustly recover the unknown image by penalizing the ^i-norm of the frame coefficients. 

For image restoration, however, the conditions on A as required by [12] are not generally satisfied, which 
means penalizing £q "norm" and ^i-norm may produce different solutions. Although both the balanced 
approach (|1.2|) and analysis based approach (|1.3[) can generate restored images with very high quality, one 
natural question is whether using £q "norm" instead of ^i-norm can further improve the results. 

On the other hand, it was observed, in e.g., [28] (also see Figure [3] and Figure [4]), that balanced approach 
(|1.2[) generally generates images with sharper features like edges than the analysis based approach (|1.3p . 
because balanced approach emphasizes more on the sparsity of the frame coefficients. However, the recov- 
ered images from balanced approach usually contains more artifact (e.g., oscillations) than analysis based 
approach, because the regularization term of the analysis based approach has a direct link to the regularity 
of u (as proven by [5]) comparing to balanced approach. Although such trade-off can be controlled by the 
parameter k in the balanced approach (|1.2[) . it is not very easy to do in practice. Furthermore, when a large 
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k is chosen, some of the numerical algorithms solving (jl.2[) will converge slower than choosing a smaller k 



Since penalizing £i-norm of Wu ensures smoothness while not as much sparsity as balanced approach, 
we propose to penalize £q "norm" of Wu instead. Intuitively, this should provide us a balance between 
sharpness of the features and smoothness for the recovered images. The difficulty here is that £q minimization 
problems are generally hard to solve. Recently, penalty decomposition (PD) methods were proposed by [40] 
for a general £q minimization problem that can be used to solve our proposed model due to its generality. 
Computational results of [50] demonstrated that their methods generally outperform the existing methods 
for compressed sensing problems, sparse logistic regression and sparse inverse covariance selection problems 
in terms of quality of solutions and/or computational efficiency. This motivates us to adapt one of their PD 
methods to solve our proposed £q minimization problem. Same as proposed in }40| . the block coordinate 
descent (BCD) method is used to solve each penalty subproblem of the PD method. However, the convergence 
analysis of the BCD method was missing from [3D] when £ q "norm" appears in the objective function. Indeed, 
the convergence of the BCD method generally requires the continuity of the objective function as discussed 
in |52j . In addition, the BCD method for the optimization problem with the nonconvex objective function 
has only been proved to converge to a stationary point which is not a local minimizcr in general (see [52] for 
details). 

Contributions. The main contributions of this paper are summarized as follows. 

1) We propose a new wavelet frame based model for image restoration problems that penalizes the £o 
"norm" of the wavelet frame coefficients. Numerical simulations show that the PD method that 
solves the proposed model generates recovered images with better quality than those obtained by 
cither balanced approach and analysis based approach. 

2) Given the discontinuity and nonconvexity of the £q "norm" term in the objective function, we have 
proved some convergence results for the BCD method which is missing from the literature. 

We now leave the details of the model and algorithm to Section [2] and details of simulations to Section [3] 



We start by introducing some simple notations. The space of symmetric n x n matrices will be denoted 
by S n . If X <G S n is positive definite, we write X y 0. We denote by I the identity matrix, whose dimension 
should be clear from the context. Given an index set J C {1, . . . xj denotes the sub- vector formed by 
the entries of x indexed by J. For any real vector, || • ||o and || • 1 1 2 denote the cardinality (i.e., the number of 
nonzero entries) and the Euclidean norm of the vector, respectively. In addition, \\x\\d denotes the weighted 
^2-norm defined by \\x\\d = V x T Dx with D >~ 0. 

2.1. Model. We now propose the following optimization model for image restoration problems, 



where y is some convex subset of R n . Here we are using the multi-index i and denote (Wu)i (similarly for 
Aj) the value of Wu at a given pixel location within a certain level and band of wavelet frame transform. 
Comparing to the analysis based model, we are now penalizing the number of nonzero elements of Wu. As 
mentioned earlier that if we emphasize too much on the sparsity of the frame coefficients as in the balanced 
approach or synthesis based approach, the recovered image will contain artifacts, although features like edges 
will be sharp; if we emphasize too much on the regularity of u like in analysis based approach, features in the 
recovered images will be slightly blurred, although artifacts and noise will be nicely suppressed. Therefore, 
by penalizing the £0 "norm" of Wu as in (|2.1j) . we can indeed achieve a better balance between sharpness of 
features and smoothness of the recovered images. 

Given that the £q "norm" is an integer-valued, discontinuous and nonconvex function, problem (|2.3[) 
is generally hard to solve. Some algorithms proposed in the literature, e.g., iterative hard thresholding 
algorithms [5j [6j [38] , cannot be directly applied to the proposed model (|2 . 1 [) unless W = I. Recently, Lu and 



(see e.g., 0SH2EJ). 



2. Model and Algorithm 



(2.1) 
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Zhang |40j proposed a penalty decomposition (PD) method to solve the following general £ minimization 
problem: 

(2.2) minf(x) + v\\xj\\ 

for some v > controlling the sparsity of the solution, where A" is a closed convex set in R", / : M" — > K 
is a continuously diffcrcntiable function, and ||a;j|jo denotes the cardinality of the subvector formed by the 
entries of x indexed by J. In view of [40], we reformulate (|2.1j) as 

(2-3) min \\\Au - f\\% + ^ AiKllo 

u^y ,a=Wu i ^— ' 

i 

and then we can adapt the PD method of [JU] to tackle problem (|2.1[) directly. Same as proposed in [40] . 
the BCD method is used to solve each penalty subproblem of the PD method. In addition, we apply the 
non-monotone gradient projection method proposed in (4] to solve one of the subproblem in the BCD method. 



2.2. Algorithm for Problem (|2.3[) . In this section, we discuss how the PD method proposed in [30] solving 
(|2.2[) can be adapted to solve problem (|2.3[) . Letting x = (u\, . . . , u n , a\, . . . , a m ), J = {n + 1, . . . , n + m}, 
J = {1, . . . , n}, f(x) = ^||-Axj — /HI, and X = {x G ]R n + m : xj = Wxj and xj G y}, we can clearly see 
that the problem (|2.3p takes the same form as (|2.2j) . In addition, there obviously exists a feasible point 
(u foas ,a fcas ) for problem when ^ 7^ 0, i.e. there exist (u fcas , a foas ) such that Wu ic&s = a foas and 

u foas g -y j n p ar ticular, we can choose (u feas , a foas ) = (0,0), which is the choice we make for our numerical 
studies. We now discuss the implementation details of the PD method when solving the proposed wavelet 
frame based model (|2.3|) . 

Given a penalty parameter g > 0, the associated quadratic penalty function for (|2.3[) is defined as 

(2.4) p e (u,a) := \\\Au - f\\ 2 D + ]T A;K|| + ^\\Wu- a\\l 

i 

Then we have the following PD method for problem (|2.3[) where each penalty subproblem is approximately 
solved by a BCD method (see [3D] for details). 

Penalty Decomposition (PD) Method for (|2.3I) : 

Let go > 0, 5 > 1 be given. Choose an arbitrary a 0,0 £ W 71 and a constant T such that T > max{i||Au fcas — 
f\\D+EiM\ai™h,KMueyP eo (u,a ' )}. Set k = 0. 

1) Set q = and apply the BCD method to find an approximate solution (u k ,a k ) G y x M m for the 
penalty subproblem 

(2.5) min{p efc («, a) : u G y, a G R m } 

by performing steps la)-ld): 
la) Solve u k ' q+1 G Argminp efc (u,a fc ' 9 ). 

lb) Solve a k < q+1 G Arg min p Bk (u k ' g+1 , a). 

lc) If (u k ' q+1 ,a k > q+1 ) satisfies the stopping criteriaof the BCD method, set (u k ,a k ) := (u k ' q+1 , a k ' q+1 ) 

and go to step 2). 
Id) Otherwise, set q <s— q + 1 and go to step la). 

2) If (u k , a k ) satisfies the stopping criteria of the PD method, stop and output u k . Otherwise, setgk+i '■= 
Sgk- 

3) If minp e (it,a fc ) > T, set a k+lfl := a foas . Otherwise, set a k+lfl := a k . 

uey 

4) Set k <— k + 1 and go to step 1). 

end 
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Remark 2.1. In the practical implementation, we terminate the inner iterations of the BCD method based 
on the relative progress of p gk (u k ' q ,a k,q ) which can be described as follows: 

\p ek {u k > q ,a k ' q )-p ek (u k ' q +\a k ^)\ 
[ ' ' m&x(\p ek (u k > q + 1 ,a k ' q+1 )\,l) " z ' 

Moreover, we terminate the outer iterations of the PD method once 

(2 7) \\Wu k -a k \\ 2 < £o 

max(|p efc (u fe ,a fc )|,l) ' 

Next we discuss how to solve two subproblems arising in step la) and lb) of the PD method. 

2.2.1. The BCD subproblem in step la). The BCD subproblcm in step la) is in the form of 

(2.8) mm\(u,Qu) - {c,u) 

for some Q >~ and c € K™. Obviously, when y = W\ problem (|2.8p is an unconstrained quadratic 
programming problem that can be solved by the conjugate gradient method. Nevertheless, the pixel values 
of an image are usually bounded. For example, the pixel values of a CT image should be always greater 
than or equal to zero and the pixel values of a grayscale image is between [0, 255]. Then the corresponding 
y of these two examples are y = {x e W : Xi > lb Vi = 1, . . . , n} with lb = and y = {x € W 1 : lb < x^ < 
ub Wi = 1, . . . , n} with lb = and ub — 255. To solve these types of the constrained quadratic programming 
problems, we apply the nonmonotonc projected gradient method proposed in [1] and terminate it using the 
duality gap and dual feasibility conditions (if necessary) . 

For y = {x e R" : x, > lb Vi = 1, . . . , n}, given a Lagrangian multiplier f3 E M™, the associated Lagrangian 
dual function of (12.81) can be written as: 



L(u,f3) = w(u) + f3 T (lb-u), 

where w(u) = ^(u,Qu) — (c,u). Based on the Karush-Kuhn- Tucker (KKT) conditions, for an optimal 
solution u* of (|2.8p . there exists a Lagrangian multiplier /3* such that 

Qu* -c-0* = 0, 
P* > Vi = l,...,n, 
{lb-u*)j3* = Vi = l,...,n. 

Then at the sth iteration of the projected gradient method, we let /3 s = Qu s — c. As {u s } approaches the 
solution u* of (|2.8p . {/3 s } approaches the Lagrangian multiplier j3* and the corresponding duality gap at 
each iteration is given by X)"=i Pi (lb — uf). Therefore, we terminate the projected gradient method when 



max(|w(u s )|, 1) max(|j/3 s ||2, 1) 

for some tolerances e_o, ef > 0. 

For y = {x G R" : lb < Xi < ub Vi = 1, . . . , n}, given Lagrangian multipliers /3, 7 € IR™, the associated 
Lagrangian function of (|2.8p can be written as: 

L(u,f3,j) = w(u) + f3 T (lb -u) + 7 T (w - ub), 

where w(u) is defined as above. Based on the KKT conditions, for an optimal solution u* of (|2.8[) . there 
exist Lagrangian multipliers (3* and 7* such that 

Qu* -c- 13* =0, 
/3? >0 Vi = l,...,n, 
7*>0Vz = l,...,n, 
(I6-«f)^ = 0Vi = l,...,n, 
« - u6)7* = Vi = l,...,n. 

Then at the sth iteration of the projected gradient method, we let /3 s = ma,x(Qu s — c, 0) and 7 s = 
— mm(Qu s — c, 0). As {u s } approaches the solution u* of (|2.8[) . {/3 s } and {7 s } approach Lagrangian mul- 
tipliers /3* and 7*. In addition, the corresponding duality gap at each iteration is given by J2i=i(Pi (lb — 
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u*)+7*(uf — ub)) and the duality feasibility is automatically satisfied. Therefore, we terminate the projected 
gradient method when 

| E?=i (Pi ( lb ~ u i) + 7i K ~ub))\ 



max(|iu(it s )|, 1) 

for some tolerance t£> > 0. 



<e D 



2.2.2. The BCD subproblem in step lb). For \i > 0, g > and c G R m , the BCD subproblem in step lb) is 
in the form of 



mm 



SkU 1 1 1 1 + ^^("i ~ 



By [40l Proposition 2.2] (see also [U [5] for example), the solutions of the above subproblem forms the 
following set: 

/ 2A ■ 

(2.9) a* e Hx (c) with A* := J — for all i, 

| S 

where -ff 7 (-) denotes a component-wise hard thresholding operator with threshold 7: 

( if la* I < 7i, 

(2.10) [iCy(»)]* = S {0:^} ifkil=7i, 

[ Xi if \xi\ > 7i. 

Note that _ff 7 is defined as a set-valued mapping [331 Chapter 5] which is different (only when |xi| = 7^) 
from the conventional definition of hard thresholding operator. 

2.3. Convergence of the BCD method. In this subsection, we establish some convergence results re- 
garding the inner iterations, i.e., Step 1), of the PD method. In particular, we will show that the fixed point 
of the BCD method is a local minimizcr of (|2.5|) . Moreover, under certain conditions, we prove that the 
sequence {(u k ' q , a k,q )} generated by the BCD method converges and the limit is a local minimizcr of (|2.5j) . 

For convenience of presentation, we omit the index k from (|2.5j) and consider the BCD method for solving 
the following problem: 

(2.11) mm{p e (u,a) : uey, aE R m }. 

Without loss of generality, we assume that D = I. We now relabel and simplify the BCD method described 
in step la)-lc) in the PD method as follows. 



U1+ 1 = argmin ue3; \\\Au - ff 2 + §\\Wu - ofl\\l 
a" +1 G Argmin Q £i \i\\<*i\\o + f II" - Wu q+1 » 2 



(2.12) 

We first show that the fixed point of the above BCD method is a local minimizer of (12.51) . 

Theorem 2.2. Given a fixed point of the BCD method (|2.12[) . denoted as (u*,a*), then (u*,a*) is a local 
minimizer of p g (u,a). 

Proof. We first note that the first subproblem of (|2.12p gives us 

(2.13) (A T (Au* -f) + gW T (Wu* -a*),v-u*} > for all v G y. 
By applying (12.91) . the second subproblem of (|2.12[) leads to: 

(2.14) a* G H- x {Wu*) . 
Define index sets 

r n := {i : al = 0} and T ± := {i : a* ^ 0}. 
It then follows from ([2TT4]) and (|2~TU1) that 

f2151 f|(W)i|<Ai forieTo 

\(Wu*)i = al fortGTi, 

where (Wu*)i denotes ith entry of Wu*. 
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Consider a small deformation vector {dh,dg) such that u* + dh £ y. Using (|2.13[) . we have 
Pe (u* + dh, a* + dg) = ^\\Au* + Adh - ff 2 + ^ Xi\\(a* + dg)i\\ + |||a* + dg - W{u* + dh)\\ 2 2 

i 

= \\\Au* - f\\l + {Adh, An* -f) + \\\Adh\\\ + £ Xi\\(a* + 5 )<|| o 

i 

+ - Wu*\\l + g(a* - Wu*,dg ~Wdh) + |||<9g - Wdh\\% 
= \\\Au* - f\\l + J2 \i\\(a* + dgUo + § \\a* Wu*f 2 + \\\Adh\\\ 

i 

+ (dh,A T (Au* - /) + gW T (Wu* - a*)) + g(dg,a* - Wu*) + |||<9g - Wdhf 2 



> \\\Au* - f\\l + X M a * + d 9)ih + f K - Wu*\ 



*||2 
2 



i 

+(<9/i, A T (Au* - /) + gW T {Wu* -a*)) + g(dg, a* - Wu*) 
(By (E33D) > ill^-ZII^ + ^AilK^ + ^iHo + lll^-^lll + ^a*-^*) 

i 

= \\\Au* - f\\l + f ||a* - + ( A *K* + ^llo + odgM - {Wu*)., 

i 

Splitting the summation in the last equation with respect to index sets T$ and T\ and using (|2.15|) . we have 

p e {u*+dh, a*+dg) > ±\\Au* - /||| + - Wu*f 2 + £ (Ai||^|| - ^. 9i (^*) i ) + £ A;K + 5i ||„. 

ier ieri 

Notice that when \dg^ is small enough, we then have 

K + 0flillo = ||a;i|o fori GIV 

Therefore, we have 



i \ a i 



p ( M * + dh, a* + dg) > -\\Au* - /||| + Ilia* - W«H + ]T (Ai||9 5i || - gd 9i {Wu*)i) + ]T A 

ier ieri 

= Pe {u*,a*)+J2 (XiWdgAlo-gdg^Wu*)^. 
ier 

We now show that, for i £ Tq and ||<9<?|| small enough, 

(2.16) AiMo - Qd gi {Wu*)i > 0. 

For the indices i such that Xi = 0, first inequality of (|2.15l) implies that {Wu*)i = and hence (|2.1(iD 
holds. Therefore, we only need to consider indices i E Tq such that ^ 0. Then obviously as long as 
\dgA < e \rwa*)-\ ' we wm nave (|2.16p hold. We now conclude that there exists e > such that for all {dh, dg) 
satisfying maxdlS/iljoo, ||<9g||oo) < £, we have p e {u* + dh, a* + dg) > p g {u*,a*). □ 

We next show that under some suitable assumptions, the sequence {{u q , a q )} generated by (|2.12[) converges 
to a fixed point of the BCD method. 

Theorem 2.3. Assume that y = W l and A T A y 0. Let {{u q ,a q )} be the sequence generated by the BCD 
method described in (|2.12[) . Then, the sequence {{u q ,a q )} is bounded. Furthermore, any cluster point of the 
sequence {{u q ,a q )} is a fixed point of (|2.12j) . 

Proof. In view of y = W 1 and the optimality condition of the first subproblem of (|2.12[) , one can see that 

(2.17) u q+1 = {A T A + giy 1 A T f + g{A T A + e I)- l W T a q . 
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Let x := (A T A + gI)~ 1 A T f, P := g(A T A + qI)~ x , equation (j2~TT)) can be rewritten as 

(2.18) u q+1 = x + PW T a q . 

Moreover, by the assumption A T A y 0, we have -< P -< I. 

Using (|2.18p and (|2.10[) . we observe from the second subproblem of (|2.12p that 

(2.19) a q+1 G H x {Wu q+1 ) = H x (Wx + WPW T a q ) . 
Let Q := I - WPW T , then (f2~TT)l) can be rewritten as 

(2.20) a q+1 G H x (oP + Wx- Qa q ) . 

In addition, from W T W — I we can easily show that -< Q ^ I. 

Let F(a,p) := |(a,Qa) - (Wx, a) + J2i AiKUo - U a ~ P> Q( a ~ P)) + h\\ a ~ PWl where A = ^. Then 
we have 



(2.21) Argmin Q F(a,a 9 ) = Argmin a -||a - (a q + Wx- Qa q )\\ 2 2 + ^Xi 



\Oti o- 



In view of equation (|2.20[) and (|2.21[) and the definition of the hard thresholding operator, we can easily 
observe that a q+1 G Argmin a .F'(a, a q ). By following similar arguments as in [5J Lemma 1, Lemma D.l], we 
have 

F(a q+ \a q+1 ) < F{a q+ \a q+1 ) + ^\\a q+1 ~a q \\l-^(a q+1 -a q ,Q{a q+1 -a q )) 

= F{a q+ \a q ) 
< F(a q ,a q ), 

which leads to 

\\a q+1 - a q \\ 2 2 - (a q+1 - a q ,Q(a q+1 - a q )) < 2F(a q ,a q ) - 2F(a q+1 , a q+1 ). 
Since P >- 0, we have 

\\W T {a q+1 -a q )\\l < ^r(W T {a q+1 - a q ), PW T {a q+1 - a q )) 

= ^(a q+1 -a q ,(I-Q)(a q+1 -a q )) 

= 7^ {\\a q+1 - a q \\j - (a q+1 - ofl, Q(a q+1 - a q ))) 

< l-F{cfl,cP)- A F(a q +\a q+1 ) 

for some G\ > 0. Telescoping on the above inequality and using the fact that ^ 4 Ai||ai||o > 0, we have 

N 11 

Y,W\a q ^ - a q )\\\ < ±F{a\oP)-±F{a N +\a N ^) 

< A (V(a°,a°) - (\(a N +\Qa N+1 ) - <I^,a w+1 )) 

< 2 ( F (a ,a°)-K), 

where if is the optimal value of min{i(y, Qy) — (Wx, y)}. Since Q >- 0, we have if > — oo. Then the last 

y 

inequality implies that limq_>oo |jM /T (a 9+1 — a 9 )||2 — > 0. 
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By using (|2.18[) and P -< I, we see that 



-W T a q+1 \ 



2 — 



> 



> 



« 9 )l|2 



x + PW T a q - W T a 9+1 + W T a q - W T a q \ 
x+(P- I)W T a q - W T (a q+1 
x+(P- I)W T a q \\ 2 - \\W T (a q+1 
(I - P)W T a q -x\\ 2 - \\W T (a q+1 - a q )\\ 2 
(I - P)W T a q \\ 2 - \\x\\ 2 - \\W T (a q+1 - a q )\\ 2 



<x q )h 



> C 2 \\W T a q \\ 2 - \\x\\ 2 - \\W T {a q+1 - a q )\\ 2 

for some C 2 > 0. Then by rearranging the above inequality and using the fact W T W = I, we have 

1 



\W'a q \\ 2 < ^ r (\\ u q + 1 -W T a q+1 \\ 2 + \\x\\ 2 + \\W T (a q+1 -a q )\\ 2 ) 



1 
1 



(\\W T {Wu 



9+1 _ ™9+l 



a q+1 )\\ 2 + \\x\\ 2 + \\W 1 (a 



9+1 



« 9 )l| 2 ) 



< 7^(\\Wu q+1 - a q+1 \\ 2 + \\x\\ 2 + \\W T (a q+1 - a q )\\ 2 ). 

^•2 

By the definition of the hard thresholding operator and (|2.19[) . we can easily see that — a 9+1 ||2 is 

bounded. In addition, notice that ||x||2 is a constant and lim^oo ||W /T (a <?+1 — ct g ) H2 — ► 0. Thus ||M^ T a 9 ||2 is 
also bounded. By using (|2 . 1 8|) and the definition of the hard thresholding operator again, we can immediately 
see that both {u q+1 } and {a 9+1 } are bounded as well. 

Suppose that (u*,a*) is a cluster point of the sequence {(u q ,a q )}. Therefore, there exists a subsequence 
{{u qi , a qi )}i converging to (u*,a*). Using (|2.19p and the definition of the hard thresholding operator, we 
can observe that 

a* = lim a qi ^ e Hx{ lim Wu^ 1 ) = H~ x (Wu*). 

I— )-oo I— >CJO 

In addition, it follows from ()2.17|) that 

u* = (A T A + gI)- 1 A T '/ + g(A T A + gI)- 1 W T a*. 
In view of the above two relations, one can immediately conclude that {(u*,a*)} is a fixed point of (|2.12|) . □ 



In the view of Theorems 12.21 [2~3l and under some suitable assumptions, we can easily observe the following 
convergence of the BCD method. 

Theorem 2.4. Assume that y = R" and A T A >- 0. Then, the sequence {(u q ,a q )} generated by the BCD 
method has at least one cluster point. Furthermore, any cluster point of the sequence {(u q ,a q )} is a local 
minimizer of (|2 . 1 1 1) . 

For the PD method itself, similar arguments as in the proof of [40) Theorem 3.2] will lead to that every 
accumulation point of the sequence {{u k , a k )} is a feasible point of (|2.3[) . Although it is not clear whether the 
accumulation point is a local minimizer of (|2.3j) . our numerical results show that the solutions obtained by 
the PD method are superior than those obtained by the balanced approach and the analysis based approach. 



3. Numerical results 

In this section, we conduct numerical experiments to test the performance of the PD method for problem 
(|2.3[) presented in Section [5] and compare the results with the balanced approach (|1.2p and the analysis based 
approach (|1.3[) . We use the accelerated proximal gradient (APG) algorithm [48] (see also [3]) to solve the 
balanced approach; and we use the split Bregman algorithm [37l QT] to solve the analysis based approach. 

For APG algorithm that solves balanced approach (|1.2|) . we shall adopt the following stopping criteria: 

. ( \\a k - a^h UW T a k - f\\ D 
mm \max{l, K|| 2 }' ||/||2 
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Table 1. Comparisons: CT image reconstruction 





Balanced approach 


Analysis based approach 


PD method 


Time 
PSNR 


56.0 
56.06 


204.8 
59.90 


147.6 
60.22 



For split Bregman algorithm that solves the analysis based approach (|1.3[) . we shall use the following stopping 
criteria: 

\\Wu k+1 - a k+l \\ 2 

11/112 - £S ' 

Throughout this section, the codes of all the algorithms are written in MATLAB and all computations 
below are performed on a workstation with Intel Xeon E5410 CPU (2.33GHz) and 8GB RAM running Red 
Hat Enterprise Linux (kernel 2.6.18). If not specified, the piecewise linear B-spline framelets constructed by 
[4"4] are used in all the numerical experiments. We also take D = I for all three methods for simplicity. For the 
PD method, we choose ej = 10~ 4 and to = 10~ 3 and set a ' , a and u s to be zero vectors. In addition, 
we choose[4j Algorithm 2.2] and set M = 20, to = 5 x 10~ 5 and ep — 10 -4 (if necessary) for the projected 
gradient method applied to one of subproblems arising in the BCD method (i.e., step la) in the PD method). 

3.1. Experiments on CT Image Reconstruction. In this subsection, we apply the PD method stated 
in Section [2] to solve problem (|2.3[) on CT images and compare the results with the balanced approach (|1.2[) 
and the analysis based approach (|1.3j) . The matrix A in (|1.1[) is taken to be a projection matrix based on 
fan-beam scanning geometry using Siddon's algorithm [49], and rj is generated from a zero mean Gaussian 
distribution with variance a = 0.01||/||oo- In addition, we pick level of framelet decomposition to be 4 for the 
best quality of the reconstructed images. For balanced approach, we set k = 2 and take ep = 1.5 x 10~ 2 for 
the stopping criteria of the APG algorithm. We set eg = 10~ 5 for the stopping criteria of the split Bregman 
algorithm when solving the analysis based approach. Moreover, we take y = {x <G M. n : Xi > Vi = 1, . . . , n} 
for model ()2.3[) . and take 5 = 10 and go = 10 for the PD method. To measure quality of the restored image, 
we use the PSNR value defined by 

PSNR:=-201og 10 ^*, 

n 

where u and u are the original and restored images respectively, and n is total number of pixels in u. 

Table [1] summarizes the results of all three models when applying to the CT image restoration problem 
and the corresponding images and their zoom-in views are shown in Figure [T] and Figure O In Table [TJ 
the CPU time (in seconds) and PSNR values of all three methods are given in the first and second row, 
respectively. In order to fairly compare the results, we have tuned the parameter A to achieve the best quality 
of the restoration images for each individual method. We observe that based on the PSNR values listed in 
Table Q] the analysis based approach and the PD method obviously achieve better restoration results than 
the balanced approach. Nevertheless, the APG algorithm for the balanced approach is the fastest algorithm 
in this experiment. In addition, the PD method is faster and achieves larger PSNR than the split Bergman 
algorithm for the analysis based approach. Moreover, we can observe from Figure [5] that the edges are 
recovered better by the PD method and the balanced approach. 

3.2. Experiments on image deconvolution. In this subsection, we apply the PD method stated in 
Section [5] to solve problem (|2.3[) on image deblurring problems and compare the results with the balanced 
approach (|1.2[) and the analysis based approach (|1.3[) . The matrix A in (|2.3[) is taken to be a convolution ma- 
trix with corresponding kernel a Gaussian function (generated in MATLAB by "fspccial('gaussian',9,1.5);") 
and 77 is generated from a zero mean Gaussian distribution with variance a . If not specified, we choose 
(7 = 3 in our experiments. In addition, we pick level of framelet decomposition to be 4 for the best qual- 
ity of the reconstructed images. We set k = 1 for balanced approach and choose both ep and eg to be 
10~ 4 for the stopping criteria of both APG algorithm and the split Bregman algorithm. Moreover, we set 
y = {x e R" : < Xi < 255 Vi = 1,. . .,n} for model ([2l3j) . and take S = 10 and g = 10~ 3 for the PD 
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Figure 1. CT image reconstruction. Images from left to right are: original CT image, 
reconstructed image by balanced approach, reconstructed image by analysis based approach 
and reconstructed image by PD method. 




Figure 2. Zoom- in views of the CT image reconstruction. Images from left to right are: 
original CT image, reconstructed image by balanced approach, reconstructed image by anal- 
ysis based approach and reconstructed image by PD method. 



method. To measure quality of restored image, we use the PSNR value defined by 



PSNR := -201og 10 



« a 



255?i 



We first test all three methods on twelve different images by using piecewise linear wavelet and summarize 
the results in Table [2j The names and sizes of images are listed in the first two columns. The CPU time (in 
seconds) and PSNR values of all three methods are given in the rest six columns. In addition, the zoom-in 
views of original images, observed images and recovered images are shown in Figure [3]|4j In order to fairly 
compare the results, we have tuned the parameter A to achieve the best quality of the restoration images for 
each individual method and each given image. 

We first observe that in Table [3J the PSNR values obtained by the PD method are generally better than 
those obtained by other two approaches. Although for some of the images (i.e. "Downhill" , "Bridge" , "Duck" 
and "Barbara"), the PSNR values obtained by the PD methods are comparable to those of balanced and 
analysis based approaches, the quality of the restored images can not only be judged by their PSNR values. 
Indeed, the zoom- in views of the recovered images in Figure [3] and Figure [4] show that for all tested images, 
the PD method produces visually superior results than the other two approaches in terms of both sharpness 
of edges and smoothness of regions away from edges. Takeing the image "Barbara" as an example, the PSNR 
value of the PD method is only slightly greater than that obtained by the other two approaches. However, 
the zoom-in views of "Barbara" in Figure [4] show that the face of Barbara and the textures on her scarf 
are better recovered by the PD method than the other two approaches. This confirms the observation that 
penalizing £q "norm" of Wu should provide good balance between sharpness of features and smoothness of 
the reconstructed images. We finally note that the PD method is slower than other two approaches in these 
experiments but the processing time of the PD method is still acceptable. 

We next compare all three methods on "portrait I" image by using three different tight wavelet frame 
systems, i.e., Haar framelets, piecewise linear framelets and piecewise cubic framelets constructed by [44] . 
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Table 2. Comparisons: image deconvolution 







Balanced 


approach 


Analysis 


based approach 


PD method 


Name 


Size 


Time 


PSNR 


Time 


PSNR 


Time 


PSNR 


Downhill 


256 


12.5 


27.24 


6.1 


27.36 


29.5 


27.35 


Cameraman 


256 


18.2 


26.65 


7.0 


26.73 


31.1 


27.21 


Bridge 


256 


14.5 


25.40 


5.1 


25.46 


33.0 


25.44 


Pepper 


256 


21.6 


26.82 


7.5 


26.63 


32.1 


27.29 


Clock 


256 


17.3 


29.42 


19.9 


29.48 


22.3 


29.86 


Portrait I 


256 


32.7 


33.93 


19.3 


33.98 


27.1 


35.44 


Duck 


464 


30.6 


31.00 


16.1 


31.11 


72.5 


31.09 


Barbara 


512 


38.8 


24.62 


12.3 


24.62 


77.4 


24.69 


Aircraft 


512 


55.9 


30.75 


35.1 


30.81 


67.5 


31.29 


Couple 


512 


91.4 


28.40 


41.5 


28.14 


139.1 


29.32 


Portrait II 


512 


45.2 


30.23 


22.1 


30.20 


48.9 


30.90 


Lena 


516 


89.3 


12.91 


31.0 


12.51 


67.0 


13.45 



Table 3. Comparisons among different wavelet representations 





Balanced 


approach 


Analysis 


based approach 


PD method 


Wavelets 


Time 


PSNR 


Time 


PSNR 


Time PSNR 


Haar 


17.9 


33.63 


20.2 


33.80 


24.3 34.68 


Piecewise linear 


32.7 


33.93 


22.3 


33.98 


27.1 35.44 


Piecewise cubic 


61.0 


33.95 


37.3 


34.00 


37.8 35.20 



Table 4. Comparisons among different noise levels for image "Portrait I 





Balanced 


approach 


Analysis 


based approach 


PD method 


Variances of noises 


Time 


PSNR 


Time 


PSNR 


Time PSNR 


a = 3 


32.7 


33.93 


22.3 


33.98 


27.1 35.44 


cr = 5 


23.7 


32.84 


19.4 


32.89 


27.2 34.48 


cr = 7 


19.6 


32.11 


25.0 


32.14 


29.7 33.69 



We summarize the results in Table |3l The names of three wavelets are listed in the first column. The CPU 
time (in seconds) and PSNR values of all three methods are given in the rest six columns. In Table |3l we can 
see that the quality of the restored images by using the piecewise linear framelets and the piecewise cubic 
framelets is better than that by using the Haar framelets. In addition, all three methods are generally faster 
when using Haar framelets and slower when using piecewise cubic framelets. Overall, all three approaches 
when using the piecewise linear have balanced performance in terms of time and quality (i.e., the PSNR 
value). Finally, we observe that the PD method consistently achieves the best quality of restored images 
among all the approaches for all three different tight wavelet frame systems. 

Finally, we test how different noise levels affect the restored images obtained from all the three methods. 
We choose three different noise levels (i.e., a — 3, 5, 7) for image "Portrait I", and test all the three methods 
by using piecewise linear framelets. We summarize the results in Table H The variances of noises arc listed 
in the first column. The CPU time (in seconds) and PSNR values of all three methods are given in the rest 
six columns. We observe that the qualities of the restored images by all three methods degrade when the 
noise level increases. Nevertheless, the PD method still outperforms other two methods. 

4. Conclusion 

In this paper, we proposed a wavelet frame based £q minimization model, which is motivated by the 
analysis based approach and balanced approach. The penalty decomposition (PD) method of [3D] was used 
to solve the proposed optimization problem. Numerical results showed that the proposed model solved by the 
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PD method can generate images with better quality than those obtained by either analysis based approach 
or balanced approach in terms of restoring sharp features like edges as well as maintaining smoothness of 
the recovered images. Convergence analysis of the sub-iterations in the PD method was also provided. 
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Figure 3. Zoom- in to the texture part of "downhill", "cameraman", "bridge", "pepper", 
"clock" , and "portrait I" . Image from left to right are: original image, observed image, 
results of the balanced approach, results of the analysis based approach and results of the 
PD method. 
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FIGURE 4. Zoom- in to the texture part of "duck" , "barbara" , "aircraft" , "couple" , "portrait 
II" and "lena" . Image from left to right are: original image, observed image, results of the 
balanced approach, results of the analysis based approach and results of the PD method. 



